Serving LLMs in HPC Clusters: A Comparative Study of Qualcomm Cloud AI 100 Ultra and NVIDIA Data Center GPUs

Sada, Mohammad Firas, Graham, John J., Khoda, Elham E., Tatineni, Mahidhar, Mishin, Dmitry, Gupta, Rajesh K., Wagner, Rick, Smarr, Larry, DeFanti, Thomas A., Würthwein, Frank

arXiv.org Artificial Intelligence

The rapid proliferation of large language models (LLMs) has fundamentally transformed scientific computing, enabling breakthroughs across domains from computational biology to materials science. As these models scale to hundreds of billions of parameters, high-performance computing (HPC) facilities face mounting challenges in providing sustainable, cost-effective inference capabilities to diverse research communities. Traditional GPU-centric approaches, while delivering exceptional throughput, present significant barriers in power consumption, cooling requirements, and capital investment; these are particularly problematic for shared research cyberinfrastructures serving hundreds of concurrent users. The National Research Platform (NRP) exemplifies these challenges and opportunities. As a federated Kubernetes-based infrastructure supporting more than 300 research groups across over 100 sites, the NRP must balance competing demands: delivering high-performance AI capabilities while managing constrained power budgets, enabling fine-grained resource allocation for multi-tenant workloads, and providing cost-effective access to emerging AI models for diverse scientific applications [1, 2].
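For context, the fine-grained multi-tenant allocation the abstract describes is typically realized in Kubernetes by exposing accelerators as schedulable extended resources via a vendor device plugin. The minimal Python sketch below (not from the paper) shows how an inference pod could request accelerator cards using the official kubernetes client; the resource name "qualcomm.com/qaic", the namespace, and the container image are illustrative assumptions, not values confirmed by the source.

    # Minimal sketch: requesting AI accelerators as Kubernetes extended resources,
    # so the scheduler can pack multiple tenants' inference pods onto shared nodes.
    # The resource name "qualcomm.com/qaic" is a hypothetical device-plugin name.
    from kubernetes import client

    def build_inference_pod(name: str, image: str, num_accelerators: int) -> client.V1Pod:
        """Build a pod spec that requests a fixed number of accelerator cards."""
        resources = client.V1ResourceRequirements(
            limits={
                "qualcomm.com/qaic": str(num_accelerators),  # assumed resource name
                "cpu": "8",
                "memory": "32Gi",
            }
        )
        container = client.V1Container(
            name="llm-server",
            image=image,
            resources=resources,
        )
        return client.V1Pod(
            metadata=client.V1ObjectMeta(name=name, namespace="llm-inference"),
            spec=client.V1PodSpec(containers=[container], restart_policy="Never"),
        )

    # Example usage: a pod requesting four accelerator cards for an LLM server.
    pod = build_inference_pod("llm-serve", "registry.example.org/llm-server:latest", 4)
    print(pod.metadata.name, pod.spec.containers[0].resources.limits)

Because the accelerator count appears as an ordinary resource limit, per-tenant quotas and node-level power budgeting can be enforced with standard Kubernetes mechanisms rather than custom schedulers.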


More choices to simplify the AI maze: Machine learning inference at the edge

#artificialintelligence

Thinking about artificial intelligence (AI) infrastructure can feel a bit like finding your way through a maze – from data collection to solution development to creating value through new insights, equipped with the right servers, compute, and storage along the way. When designing infrastructure for your AI solutions, a helpful way to navigate the maze of choices is to begin with your end-to-end workflow – from data generation to solution deployment and value creation. Considering your requirements at each stage makes it easier to select the right infrastructure for your needs. We at HPE can help along this journey by offering a broad portfolio of servers and storage so that you can optimize your deployments – from data center to edge. The latest choice is the first HPE server product based on a specialized AI processor: the HPE Edgeline EL8000 platform with the Qualcomm Cloud AI 100 accelerator.


Qualcomm Stakes Beachhead In Artificial Intelligence With Foxconn Gloria AI Edge Box

#artificialintelligence

When most folks think of Qualcomm, the first technologies that likely come to mind are the company's industry-leading mobile platform systems-on-chip for smartphones, as well as its end-to-end 5G connectivity solutions. However, whether you consider applications like image recognition, speech input, natural language translation, or recommendation engines, modern smartphone platforms typically require a lot of artificial intelligence (AI) processing horsepower as well. As such, after years of developing silicon and software platform solutions for mobile AI applications, it stands to reason that Qualcomm has an opportunity to bring its AI accelerator technology to other intelligent edge devices and the cloud. And that's just what's happening with Qualcomm's Cloud AI 100 inference accelerator portfolio, as evidenced by the company's recent joint announcement with Foxconn, one of the largest electronics contract manufacturers and ODMs in the world. Foxconn's Industrial Internet division has launched a new AI-enabled machine vision platform called Gloria.